Robust speaker identification using posterior union models

Authors

  • Ji Ming
  • Darryl Stewart
  • Philip Hanna
  • Pat Corr
  • Francis Jack Smith
  • Saeed Vaseghi
Abstract

This paper investigates the problem of speaker identification in noisy conditions, assuming that there is no prior knowledge about the noise. To confine the effect of the noise on recognition, we use a multi-stream approach to characterize the speech signal, assuming that while all of the feature streams may be affected by the noise, there may be some streams that are less severely affected and thus still provide useful information about the speaker. Recognition decisions are based on the feature streams that are uncontaminated or least contaminated, thereby reducing the effect of the noise on recognition. We introduce a novel statistical method, the posterior union model, for selecting reliable feature streams. An advantage of the union model is that knowledge of the structure of the noise is not needed, thereby providing robustness to time-varying unpredictable noise corruption. We have tested the new method on the TIMIT database with additive corruption from real-world nonstationary noise; the results obtained are encouraging.
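The stream-selection idea in the abstract can be illustrated with a small sketch. This is not the paper's posterior union model, whose formulation is not given here. It is a simplified stand-in that assumes independent feature streams, a uniform prior over speakers, and precomputed per-stream log-likelihoods. The function name `identify_speaker`, the `min_streams` parameter, and the toy scores are all hypothetical.

```python
from itertools import combinations
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def identify_speaker(stream_loglik, min_streams=1):
    """Pick the (speaker, stream subset) pair with the highest posterior.

    stream_loglik maps each speaker to a list of per-stream
    log-likelihoods log p(o_i | speaker). Assumptions (not from the
    paper): streams are independent and the speaker prior is uniform.
    """
    speakers = list(stream_loglik)
    n_streams = len(stream_loglik[speakers[0]])
    best_speaker, best_subset, best_log_post = None, None, -math.inf

    for k in range(min_streams, n_streams + 1):
        for subset in combinations(range(n_streams), k):
            # Joint log-likelihood of the chosen streams for each speaker.
            joint = {s: sum(stream_loglik[s][i] for i in subset)
                     for s in speakers}
            # Normalising over speakers gives a log-posterior, which is
            # comparable across subsets of different sizes.
            norm = log_sum_exp(list(joint.values()))
            for s in speakers:
                log_post = joint[s] - norm
                if log_post > best_log_post:
                    best_speaker, best_subset, best_log_post = s, subset, log_post

    return best_speaker, best_subset, best_log_post


# Toy scores: stream 2 is treated as noise-corrupted and misleadingly favours B.
toy_scores = {
    "A": [-1.0, -1.0, -30.0],
    "B": [-8.0, -8.0, -12.0],
}
speaker, subset, log_post = identify_speaker(toy_scores, min_streams=2)
print(speaker, subset)
```

In this toy data, summing all three streams would favour speaker B, while the subset search with `min_streams=2` settles on streams 0 and 1 and speaker A. The lower bound on subset size is a design choice of the sketch: without it, a single misleading stream can yield the most confident posterior.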

Similar resources

Speaker recognition based on variational Bayesian method

This paper presents a speaker identification system based on Gaussian Mixture Models (GMM) using the variational Bayesian method. Maximum Likelihood (ML) and Maximum A Posteriori (MAP) are well-known methods for estimating GMM parameters. However, because they produce a point estimate of the model parameters, overtraining occurs when data are insufficient. The Bayesian approach estimates a posterior distri...

Robust text-independent speaker identification using Gaussian mixture speaker models

This paper introduces and motivates the use of Gaussian mixture models (GMM) for robust text-independent speaker identification. The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are effective for modeling speaker identity. The focus of this work is on applications which require high identification rates using short utterance ...

Feature vector classification by threshold for speaker identification

This paper describes a new feature vector classification method for speaker identification. The purpose of this paper is to construct robust speaker models that use only meaningful feature vectors and discard confusing ones. To construct a robust speaker model, the proposed method classifies feature vectors using log-likelihood estimation. Experimental results, with various segments ranging f...

KL realignment for speaker diarization with multiple feature streams

This paper investigates Kullback-Leibler (KL) divergence based realignment with application to speaker diarization. KL divergence based realignment operates directly on the speaker posterior distribution estimates and is compared with traditional realignment performed using an HMM/GMM system. We hypothesize that using posterior estimates to re-align speaker boundarie...

Automatic Language Identification with Discriminative Language Characterization Based on SVM

Robust automatic language identification (LID) is the task of identifying the language from a short utterance spoken by an unknown speaker. The mainstream approaches include parallel phone recognition language modeling (PPRLM), support vector machine (SVM) and the general Gaussian mixture models (GMMs). These systems map the cepstral features of spoken utterances into high level scores by class...

Publication date: 2003